A Load Balancing Package on Distributed Memory Systems and its Application

Authors

  • X. Yuan
  • C. Salisbury
  • D. Balsara
Abstract

We present a tool, Bisect, for balanced decomposition of spatial domains. In addition to applying a nested bisection algorithm to determine the boundaries of each subdomain, Bisect replicates a user-specified zone along the boundaries of the subdomain in order to minimize future interactions between subdomains. Results of running the tool on the Cray T3D system using both shared memory operations and MPI communications are reported and discussed. In addition, Bisect is used in a parallel implementation of a particle-particle/particle-mesh (P3M) simulation program on the Cray T3D system. The performance of the P3M program with different load-balancing criteria is evaluated and compared. The results show that the use of the Bisect package balances the load efficiently and minimizes communication on the T3D massively parallel system.
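
The two ideas named in the abstract, nested bisection and replication of a boundary zone, can be illustrated with a small serial sketch. This is not the Bisect package or its interface: the function names `nested_bisection` and `with_halo`, the 2D point representation, the cut along the longer box side, and the use of point count as the load measure are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, xmax, ymin, ymax)


def nested_bisection(points: List[Point], box: Box,
                     levels: int) -> List[Tuple[Box, List[Point]]]:
    """Split `box` into up to 2**levels subdomains with near-equal point counts.

    Hypothetical helper: load is assumed to be the number of points.
    """
    if levels == 0 or len(points) < 2:
        return [(box, points)]
    xmin, xmax, ymin, ymax = box
    # Assumption: cut across the longer side of the current box.
    axis = 0 if (xmax - xmin) >= (ymax - ymin) else 1
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    # Place the cut halfway between the two points adjacent to the median.
    cut = 0.5 * (pts[mid - 1][axis] + pts[mid][axis])
    if axis == 0:
        lo_box, hi_box = (xmin, cut, ymin, ymax), (cut, xmax, ymin, ymax)
    else:
        lo_box, hi_box = (xmin, xmax, ymin, cut), (xmin, xmax, cut, ymax)
    return (nested_bisection(pts[:mid], lo_box, levels - 1)
            + nested_bisection(pts[mid:], hi_box, levels - 1))


def with_halo(points: List[Point], box: Box, halo: float) -> List[Point]:
    """Return every point inside `box` grown by `halo` on all sides.

    This mimics replicating a user-specified zone along subdomain
    boundaries so that short-range work needs no further exchange.
    """
    xmin, xmax, ymin, ymax = box
    return [p for p in points
            if xmin - halo <= p[0] <= xmax + halo
            and ymin - halo <= p[1] <= ymax + halo]
```

In a distributed-memory setting each subdomain would be owned by one processor, and the points returned by `with_halo` beyond the owned set would be the replicated copies received from neighbours. Weighting points by an estimated cost instead of counting them is one way to express different load-balancing criteria.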

Similar resources

Load Balancing Strategies for Distributed Memory

Load balancing in large parallel systems with distributed memory is a difficult task often influencing the overall efficiency of applications substantially. A number of efficient distributed load balancing strategies have been developed in the recent years. Although they are currently not generally available as part of parallel operating systems, it is often not difficult to integrate them into applicat...

Performance Analysis and Optimization on a Parallel Atmospheric General Circulation Model Code

An analysis is presented of the primary factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on distributed-memory, massively parallel computer systems. Several modifications to the original parallel AGCM code aimed at improving its numerical efficiency, load-balance and single-node code performance are discussed. The impact of...

Load Balancing Problem for Parallel Computers with Distributed Memory

This paper deals with load balancing of parallel algorithms for distributed-memory computers. The parallel versions of BLAS subroutines for matrix-vector product and LU factorization are considered. Two task partitioning algorithms are investigated and speed-ups are calculated. The cases of homogeneous and heterogeneous collections of computers/processors are studied, and special partitioning al...

Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems

We present a new approach to utilizing all CPU cores and all GPUs on heterogeneous multicore and multi-GPU systems to support dense matrix computations efficiently. The main idea is that we treat a heterogeneous system as a distributed-memory machine, and use a heterogeneous multi-level block cyclic distribution method to allocate data to the host and multiple GPUs to minimize communication. We ...

ParLeda: A Library for Parallel Processing in Computational Geometry Applications

ParLeda is a software library that provides the basic primitives needed for parallel implementation of computational geometry applications. It can also be used in implementing a parallel application that uses geometric data structures. The parallel model that we use is based on a new heterogeneous parallel model named HBSP, which is based on BSP and is introduced here. ParLeda uses two main lib...

Publication date: 1997